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. A predictive model of terrorist activity is developed by examining 

jrt ' the daily number of terrorist attacks in Indonesia from 1994 through 

2007. The dynamic model employs a shot noise process to explain 
the self-exciting nature of the terrorist activities. This estimates the 
^O ' probability of future attacks as a function of the times since the past 

attacks. In addition, the excess of nonattack days coupled with the 
presence of multiple coordinated attacks on the same day compelled 
the use of hurdle models to jointly model the probability of an attack 
day and corresponding number of attacks. A power law distribution 
with a shot noise driven parameter best modeled the number of at- 
tacks on an attack day. Interpretation of the model parameters is 
Cy . discussed and predictive performance of the models is evaluated. 

1. Introduction. Since the attacks of September 11, 2001 there has been 
a substantial increase in the amount of resources dedicated to combating ter- 
rorism. As a single example, the Congressional Research Services estimates 
qq ■ that as of September 2010 $1.2 trillion USD has been spent on the Global 

^sO ! War on Terror, including $28.5 billion USD on enhanced security [Belasco 

(2010)]. Despite this increase in spending, there are no accurate measures of 
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the effectiveness of these counter-terrorism efforts [Lum, Kennedy and Sher- 
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ley (2006), Perl (2007)]. According to the Congressional Research Service, 
one of the keys to combating terrorism effectively is to understand, and have 
correctly specified models for, terrorist activity: 



"Better understanding of the dynamics of terrorism allows for a more complete 
r> ■ picture of the complexities involved in measuring success or failure and can 

— < | assist the 110th Congress as it coordinates, funds, and oversees anti-terrorism 

policy and programs" [Perl (2007)]. 

A dynamic model for terrorism should describe terrorist activity in a way 
that is consistent with both the theoretical framework for terrorism and the 
observed data. Defining accurate models of terrorist activity is important not 
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just from the standpoint of assessing the effectiveness of counter-terrorism 
efforts but also for use as a tool to predict the future risk of terrorist attack. 

A majority of publications in the literature use one of three basic model- 
ing approaches to terrorism. The work of Enders and Sandler (1993, 2000, 
2002, 2006) and Barros (2003) uses time series analysis techniques including 
intervention analysis and vector autoregressive (VAR) models. Other more 
recent papers have applied group based trajectory analysis [Nagin (2005)] to 
analyze terrorist data. Examples of this include LaFree, Morris and Dugan 
(2010) who incorporate a zero-inflated Poisson distribution to account for an 
excess of zero counts. Other approaches use Cox proportional hazards mod- 
els to directly model the time between attacks [Dugan, LaFree and Piquero 
(2005), LaFree, Dugan and Korte (2009)]. These last two papers attempt to 
account for the recent event history as influencing current events by con- 
structing an ad hoc term that describes the recent density of attacks. While 
this is a step toward incorporating the dynamics related to the process his- 
tory, it requires prespecification of the form of dependence. 

The desire to include terms to account for recent event history is rooted 
in the theoretical understanding of terrorism and other politically moti- 
vated violence. The clustering hypothesis, concurrent with contagion theory 
[Midlarsky (1978)], explains politically motivated violent activity as a se- 
ries of nonindependent events, where each event influences the probability 
of subsequent events. This hypothesis is well accepted in criminological and 
sociological approaches to terrorism. Clustering effects and contagious be- 
havior have been demonstrated in military coups [Li and Thompson (1975)], 
international terrorism [Midlarsky, Crenshaw and Yoshida (1980)], airline 
hijackings [Holden (1986)], race riots [Myers (2000)] and insurgent activity 
[Townsley, Johnson and Ratcliffe (2008)]. 

In examining terrorism data in general there are two major issues to 
address. The first is that as terrorist attacks are usually rare, the daily num- 
ber of terrorist attacks are often zero, but due to large coordinated attacks 
there are some extreme values. This combination of a large number of zeros 
and extreme values is poorly modeled by standard probability distributions. 
Second, the timing of terrorist incidents appear to be clustered, rendering 
standard models that assume independence unsuitable. 

The first of these issues is addressed using a hurdle model [Mullahy 
(1986)], also known as the two-part model [Heilbron (1994)]. The hurdle 
model is a two component model that allows separate specifications of the 
probability of a zero count and the probability of a nonzero count. This al- 
lows the hurdle model to accommodate a large number of zeros in addition 
to some extreme counts. The hurdle model is used in a variety of applica- 
tions, including in ecology for modeling counts of rare species [Welsh et al. 
(1996)], in public health for modeling smoking behavior [Jones (1994)], in 
political science for modeling proportional representation in minority elec- 
torates [Marschall, Ruhil and Shah (2010)] and network change detection 



SELF-EXCITING HURDLE MODELS FOR TERRORIST ACTIVITY 3 

[Heard et al. (2010)]. For terrorism modeling, the hurdle model is preferred 
to the zero-inflated model [Lambert (1992)] which assumes that the extra ze- 
ros observed are due to censoring. The hurdle model assumes that the extra 
zeros are due to a separate process (the "hurdle"), which must be overcome 
before the number of corresponding incidents are determined. This is more 
reasonable for terrorism data, as it can often be assumed that sparsity of 
attacks is because they are indeed rare, not because they are unobserved. 

The second issue regarding the clustering behavior of terrorist activity is 
addressed by incorporating a self-exciting component [Hawkes (1971)]. The 
self-exciting component specifies that the probability of a event is a function 
of the time (and possibly other aspects) of all previous events, such that the 
effect on the probability decreases over time. This can help account for the 
clustering and dynamic nature of terrorism. For example, Holden (1986, 
1987) developed a self-exciting Poisson model that better represented the 
dynamics of airline hijacking and Mohler et al. (2011) a self-exciting space- 
time model for residential burglaries. 

This paper combines these two concepts to examine the use of self-exciting 
hurdle models for terrorist activity in Indonesia and Timor-Leste between 
1994 and 2007. The hurdle component is modeled as a self-exciting Bernoulli 
process where the form of self-excitation is dictated by the data. The nonzero 
counts are modeled by an extreme value distribution with parameters that 
are also a function of the event history. The corresponding self-exciting hur- 
dle model is capable of making predictions and providing information about 
the risk of terrorist activity without any additional covariate information 
other than the event history. This capability is important as covariate infor- 
mation for terrorism data is often either missing or unreliable. 

2. Data and exploratory analysis. The Global Terrorism Database [La- 
Free and Dugan (2007)] is an open-source publicly available database of 
over 87,000 terrorist events around the world from 1970 through 2008. The 
database is continually updated with new information and is believed to be 
the most comprehensive database of its kind. The data used here are a subset 
of the GTD consisting of daily counts of terrorist attacks in Indonesia 1 from 
January 1, 1994 through December 31, 2007. 

2.1. Exploratory data analysis. The data from a training period of 1994 
through 2000 was examined to inform model construction. Of the 2,557 days 
considered in this analysis, there were 250 terrorist attacks on 158 unique 
event days. Figure 1, displaying the daily and semi- yearly counts of attacks, 
illustrates the nature of the terrorist activity. In particular, the attacks ap- 
pear clustered with long stretches between attacks followed by a period of in- 



1 This includes data from Timor-Leste (East Timor) which became a sovereign state in 
2002. 
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Fig. 1. Daily number of terrorist events in Indonesia. The histogram shows the cumula- 
tive number of attacks in each six month period. 



creased activity. In addition, while most days have no attacks (93.8%), some 
days have multiple attacks (possibly due to planned coordinated attacks). 

Table 1 shows the distribution of the observed attacks per day compared 
to the expected values under a Poisson and negative binomial probability 
distribution fitted with maximum likelihood. While the Poisson cannot cap- 
ture any of the tail behavior, the negative binomial adapts better to the 
tail distribution but consequently underestimates the number of days with 
only 1 attack and overestimates the number of days with 2 and 3 attacks. 
A chi-square goodness-of-fit test yields a p- value of 0.00016, suggesting that 



Table 1 
Distribution of observed and expected number of Indonesian terrorist attacks per day 

(1994-2000) 



# Attacks 



#Days 



Poisson 



Neg.Bin 




1 
2 
3 
4 
>4 

AIC 



2,399 

130 

16 

7 
1 
4 



2,319 

227 

11 







1,988.0 



2,401 

103 

31 

12 

5 

5 

1,481.; 
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the negative binomial is not a sufficient model. In addition, the tail is com- 
prised of several extreme values (36, 11, 10 and 6 attacks on a single day) that 
exacerbate the general lack of fit. The large number of zeros coupled with 
the presence of several extreme values necessitate more complex models. 

Figure 1 also reveals the possibility that the attack days are clustered. 
This is examined with Ripley's K-function [Dixon (2002)]. The i^-function 
is a second-order function that can reveal if clustering (or inhibition) is 
present in a point pattern. It is defined loosely as 

K(t) ex E[$: of future events < t from a randomly chosen event]. 

Because this measure will be effected by an inhomogeneous attack day proba- 
bility, we estimate it [Baddeley, M0ller and Waagepetersen (2000), Veen and 
Schoenberg (2006)]: 

A? 



ftr t \ = T -isr?0iy*- Htj-tj^t) 



, Pi rr- Pj 

where T is the total observational time, f>i is the estimated probability of 
at least one attack at day £$, and wi is a one-sided edge correction. The 
estimated probability is given by the model in (9) that includes trend and 
seasonality components (i.e., model BL5 in Table 2; see Section 4). 

Figure 2 plots Kit) — t which has expected value if the estimated proba- 
bilities are correct and there is no unaccounted for clustering. The observed 
values fall well above the 95% pointwise confidence intervals (obtained from 
a parametric bootstrap, 1,000 simulations), showing that the data display 
clustering behavior not accounted for by the baseline model. 

These characteristics of the terrorism data require flexible models that 
can handle the temporal attack patterns, clustering, an excess of nonattack 
days and the possibility of multiple coordinated attacks on the same day. 
Self-exciting hurdle models are developed in the next section to represent 
the complex nature of such terrorist activity. 

3. Hurdle models and the self-exciting process. 

3.1. Hurdle models. The hurdle models of Mullahy (1986) refer to a class 
of two component models for discrete count data. These models assume 
that two different processes drive the zero and nonzero counts, respectively. 
The hurdle component of the model corresponds to the probability that the 
count is nonzero (i.e., a terrorist action will occur), while the count com- 
ponent corresponds to the distribution of positive counts (i.e., the number 
of corresponding terrorist events). When the two components are combined, 
the hurdle model produces a probability mass function on the nonnegative 
integers. Additionally, the hurdle approach facilitates the use of two sim- 
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Fig. 2. Plot of K(t) — t for Indonesian terrorism (1994-2000). The solid line is from the 
baseline model (BL$) with the gray band showing the 0.025 and 0.975 quantiles estimated 
from 1,000 simulations. The dashed line is from the self-exciting model (SE±) presented 
in Section 4- 



pier models in place of one more complicated model. This will be especially 
appealing in model fitting, as the likelihood may be separable, allowing for 
independent evaluation of each component. 

For modeling the daily counts of a terrorist process, let Yt be the number 
of events on day t and Et be the indicator for an event day such that Et = l 
if Yt ■> 1 (he., there is at least one terrorist attack on day t) and Et = if 
Yt = (i.e., there are no attacks on day £). Also, let £j denote the ith event 
day (not event, but day with at least one attack). 

Letting Tit = {Y s :s < £} be the internal history of the terrorist process, 
the hurdle component is modeled as a Bernoulli random variable with hurdle 
probability p t = Pr(£' i = \\H t -{) specified conditionally on the past history 
of the event process. A separate count model is constructed for Yt\Et = 1 
with density ft{y) = Pr(Yj = y\E t = \,%t-i)i for y G {1,2, .. .}. Because the 
count component is constructed conditionally on there being at least one 
event, it will have no support on [i.e., /t(0) =0]. Combining the hurdle 
and count components gives the full density 



(1) 



fZ(y)=Pr(Y t = y\Ut-i) 



ft{y)pt, 
i -pt, 



y>o, 
y = o. 



This has a similar form to a zero-inflated model [Lambert (1992)] but 
differs in that for the zero-inflated model Pr(Y"j = 0|%t-i) = (1 —Pt)ft(0)- 
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Zero-inflated models usually arise when there is a random censoring pro- 
cess that prevents observations at certain times. Alternatively, instead of 
assuming that terrorist events occurred but were not observed, the hurdle 
approach explicitly models the probability that no events occurred. 

The log-likelihood for a hurdle model can be decomposed as the sum of 
two terms logL = logLi + logL2, where for T observations 

T 

(2) log L 1 = Y J E t \ogp t + (I -E t )\og{l-p t ), 

t=i 

T 



log L 2 = ^2 E t - log f t (y t ) 
(3) 



t=i 

Yl l °sft(yt) 



t:E t =l 

The first term (in the form of a Bernoulli process) represents the hurdle 
component and the second term is the usual sum for a set of observations 
coming from the density ft and represents the number of events per event 
day. The form of the likelihood is equivalent to a discrete time version of 
a marked point process [Daley and Vere- Jones (2003)], where the marks are 
the number of attacks on an event day. A particular benefit of this represen- 
tation emerges when ft and pt share no common parameters, allowing their 
log-likelihoods to be handled separately for parameter estimation. 

3.2. Self-exciting process. It has been suggested that some terrorist pro- 
cesses exhibit self-excitation or contagion behavior [Holden (1986), Dugan, 
LaFree and Piquero (2005), LaFree, Dugan and Korte (2009)]. A self-exciting 
point process [Hawkes (1971)] is one where the realization of events increases 
the short-term probability of observing future events, much in the same 
manner that one contagious individual can infect other individuals (while 
they are still infectious) or how major earthquakes lead to aftershocks. This 
type of model can be written in the form of a cluster process [Hawkes and 
Oakes (1974)] and used to explain the apparent clustering of terrorist activ- 
ities. Specifically, we consider a generalized shot noise process [Rice (1977)], 
where the self-exciting component St > will be a nonnegative function of 
the past history Tit-i of the form 

S t = ^2 E s a s9(t ~ s) 

s<t 

(4) 

= ^ a i9(t-U)- 

i:ti<t 

The magnitude parameter, a>i, determines the influence that the ith. event 
day has on the self-exciting process. It may be a function of other associated 
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information about the event day (e.g., number of events, number killed, 
success indicator, group attribution). Based on the results of the exploratory 
data analysis, we restrict our attention to the case where oii > 0, although 
inhibition effects could be obtained if negative values were permitted. 

The decay function g(-) specifies the shape of the excitation based on 
the time since the previous event days. To aid in parameter interpretation 
and estimation, we specify <?(•) to be a proper probability mass function 
with strictly positive support such that g(u) > 0, g(u) = for u < 1, and 
SuLi 9( u ) = 1- This ensures that the influence of an event will eventually 
diminish and makes cxi solely responsible for the magnitude of the effects 
contributed by the ith event. 

Any discrete distribution can be used for the decay function, provided its 
support is limited to the set of positive integers. Standard distributions, like 
Poisson or negative binomial, may require shifting or truncation to avoid 
having support on zero. For example, the (mean specified) shifted negative 
binomial is 

T(r + u-l)( r \ r f n-1 
(5) g{u;n,r)- 



T(r)(u — 1)! \fi — 1 + r) \/x — 1 + r 

which has a mean of \x > 1, and size parameter r > 0. The corresponding 
shifted geometric distribution is given by g(u;fj,,r = 1) and the shifted Pois- 
son density can be defined as the limiting case when r — > oo. 

Figure 3 shows the geometric and negative binomial decay functions along 
with the shot noise processes corresponding to 8 event days. This illustrates 
how the form of the shot noise process from (4) is similar to kernel inten- 
sity estimation [Diggle (1985)], but with weights and one-sided (predictive) 
kernels. 

3.3. Self-exciting Bernoulli hurdle process. The event days, Et, are mod- 
eled as a self-exciting Bernoulli process where the probability of an event 
day is excited (increased) by the occurrence of previous events. Specifically, 
we consider the hurdle probability on day t to be a function of the process 

(6) X t = B t + S t , 

where Bt is a baseline process and St is the self-exciting component given 
by (4). The baseline can be a function of any exogenous variables that could 
have an effect on the process (e.g., social, political or economic conditions 
and events, counter-terrorism efforts, etc.), but it will not include any infor- 
mation coming from the internal history. 

To ensure the hurdle probability pt € [0, 1], a transformation r/(pt) = Xt is 
used. If the baseline process is nonnegative (i.e., Bt > 0), then Xt > and an 
appealing transformation is ij(p) = — log(l —p), corresponding to the hurdle 
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Fig. 3. (a) and (b) show the decay functions for the shifted geometric (fj, = 40) and 
shifted negative binomial (fi = 40, r = 5), respectively, (c) and (d) show the corresponding 
shot noise processes (solid lines) and scaled component kernels (dotted lines) for 8 events 
with oti — a — 0.5. 



probability 

(7) 



Pt = 1 - e 



-Xi 



While other transformations (e.g., logit) could be used, this one provides 
several benefits. It enjoys a likelihood function that is computationally con- 
venient, substituting (7) into (2) obtains the Bernoulli log-likelihood 



(8) 



log Li = ]T log(e**i - 1) - J\Y t 



KU<T 



t=l 



The shot noise component of the second sum (X)t=i %t = Y2t=i ^>t + J2t=i &t) 
can be simplified by recognizing that 

T 



t=\ 



i : U <T 
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where G is the cumulative density function of g. This significantly reduces 
computation, as it is only calculated for the event days and not over all days 
in the observation window. 

This transformation also allows for a straightforward interpretation of the 
parameters in Xt. Notice that pt is also the probability that a Poisson ran- 
dom variable with rate Xt is greater than 0. When events are rare and Xt 
is small, Et can be modeled approximately by a Poisson distribution. Ac- 
cording to the superposition property of the Poisson, the event days arise 
from two sources, the baseline process with rate Bt and the shot noise pro- 
cess with rate St- The rate from the shot noise can be further decomposed 
into its elements ong{t — U) for {i : ti < t}. Thus, under the Poisson approx- 
imation, event day i will generate an expected J2"^=i a id( u ) = a i number of 
additional event days and the decay function <?(•) controls when the extra 
event days will occur. While these interpretations will not hold exactly under 
the Bernoulli model, they do provide an indication of how each component 
effects the process. 

3.4. Survival functions. This formulation for the hurdle process also pro- 
vides simple expressions for some properties of the time until the next event 
day. For a given time to, let r be the time of the next event day. The survival 
function Vt (u) = Pr(r — to > u) gives the probability that the next event 
day will be more than u days away. For the self-exciting hurdle process, this 
becomes 

V t0 (u) = Pr(E t0+l = E t0+2 = ■■■ = E t0+U = 0) 

u 

=n{ i -pt +f>) 

8=1 

= exp<^ -J2 x to+S> 

for u G {1, 2, . . .}, where X to+ $ is calculated assuming no new events have 
occurred since to . The survival function can be used to calculate the expected 
time until the next event day 



E to [T]=E[T-to] = l + Y t V to (u) 



u=X 
oo ( u ~\ 

= l + ^exp<^ -J2 x t +s>- 

u=l I 5=1 J 

The survival function specification of this model can be useful for prediction, 
and making inference about future attacks given the current history. 
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4. Results. In order to build the models and evaluate their predictive 
performance, the data are partitioned into two time periods. The first time 
period from 1994 through 2000 is used to construct the models and estimate 
their parameters. The second period from 2001 through 2007 is used to 
assess the predictive performance of the models. 

4.1. Event day modeling. Recall that the hurdle probability is construc- 
ted from the sum of a baseline and self-exciting processes (6). In order to 
capture potential seasonality and other large scale trends in the data, the 
baseline process B t is defined as 

(9) log(Bt) = /3 + Pit + /3 2 t 2 + Ai sm(27Tt/uj) + A 2 cos(2vrt/w), 

where the seasonal terms have a period of u = 365.25 days. This results 
in five baseline parameters 6 b = {Po, Pi, 02, Ai, A 2 ) € M 5 . The log transform 
ensures Bt > 0. We considered self-exciting processes of the form 

(10) S t {a,fi,r) = a ^ g(t-ti;fi,r), 

i : U <t 

where g(u;/j,,r) is a shifted negative binomial decay function (5) and the 
magnitude a is a constant. This results in three parameters for the self- 
exciting component 6 s = (a,/j,,r) 6 RJL. 

The combined model for Xt = Bt + St results in up to 8 parameters to 
estimate. Estimation is carried out by maximizing the log-likelihood func- 
tion given in (8). Since there is no explicit solution available, we used R's 
numerical optimization routine nlminb [R Development Core Team (2011)] 
to obtain estimates. For all models, the estimates converged within seconds 
from a variety of starting points. 

Table 2 shows the evaluated models and their corresponding AIC scores. 
The lowest AIC belongs to the four component model with a constant base- 
line (SEi), providing essentially a discrete time version of the Hawkes model 
[Hawkes (1971), Ozaki (1979)]. This AIC is much lower than the best base- 
line only model (BL5) which includes the periodic terms. Figure 2 plots the 
weighted if -function for SEi and BL5 showing the self-exciting model is 
much tighter around indicating a better fit over the baseline only model 
[Diggle (1979)]. Figure 4 shows the estimated hurdle probability in the train- 
ing period for both models. 

The parameters of the self-exciting model (SEi) suggest a baseline hurdle 
probability of about 0.012 when the shot noise term drops to zero. However, 
according to the Poisson approximation, every event day will generate an ex- 
pected a = 0.89 additional event days. The generated event day is expected 
to occur in fi = 37.54 days, but the time until occurrence has a median of 
only 16 days. 
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Table 2 
Parameter estimates for the event day models for the training period (1994-2000). 
BL are the baseline only and the SE are the self-exciting models 



The 



Model 


AIC 


0o 


/3i 


02 


Ax 


A 2 


a 


M 


r 


BLi 


1,187.77 


-2.75 
















BL 2 


1,149.54 


-8.2 


8.05 














BL 3 


1,151.25 


-2.83 


-8.01 


11.91 












BL 4 


1,171.22 


-2.82 






-0.42 


-0.31 








BL 5 


1,136.80 


-8.02 


7.69 




-0.35 


-0.32 








BL 6 


1,138.60 


-3.53 


-5.74 


9.95 


-0.35 


-0.32 








SEi 


1,075.21 


-4.41 










0.89 


37.54 


0.45 


SE 2 


1,077.16 


-5.17 


1.23 








0.87 


36.55 


0.45 


SE 3 


1,078.71 


15.31 


-63.2 


50.24 






0.85 


35.69 


0.45 


SE 4 


1,077.03 


-4.34 






-0.41 


-0.40 


0.86 


38.55 


0.44 


SE 5 


1,078.76 


-5.99 


2.68 




-0.38 


-0.42 


0.82 


36.57 


0.43 


SE 6 


1,079.60 


20.46 


-79.33 


62.91 


-0.43 


-0.45 


0.80 


37.28 


0.43 



4.2. Count modeling. It can be obtained from Table 1 that most event 
days (92.4%) are comprised of only 1 or 2 attacks. However, several days 
(1.9%) have more than 9 attacks (with one day having 36 recorded terrorist 
attacks). This suggests count distributions that have an initial rapid decay, 



0.30 



0.25 



o.io 



0.05 



o.oo 




Fig. 4. Hurdle probabilities for the best fitting self-exciting model SE\ (solid line), the 
best fitting baseline only model BLs (dotted line), and the observed piecewise constant 
probabilities (blocks). The rug signifies the event days. 
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but still possess a long tail to accommodate the extreme values. One such 
discrete distribution is the Riemann zeta or discrete Pareto distribution. 
This power law distribution has recently been employed to model the severity 
(e.g., number killed or injured) of terrorist attacks [Clauset, Young and 
Gleditsch (2007)]. 

Using it to model the event day counts, under an i.i.d assumption, gives 
the probability mass function 

ft(y,s) = ^-, ye {1,2,...}, 

C(s) 

where £(s) = Yl°c=i x ~ s 1S the Riemann zeta function and the parameter 
s G (l,oo). This one parameter model is easy to estimate numerically via 
maximum likelihood [Goldstein, Morris and Yen (2004), Seal (1952)] by 

finding the s that satisfies the equation ^ YH=i ^°s(Ui) = ~c(sT' ^e training 
period data gives rise to an estimated parameter of s = 2.86. The bootstrap 
Kolmogorov-Smirnov test of Clauset, Shalizi and Newman (2009) gives a p- 
value of 0.69, suggesting the zeta distribution provides a suitable fit. 

While the constant parameter zeta distribution appears to provide a good 
fit, there may be better models, especially if the distribution is allowed to 
vary over time. In particular, as for the hurdle component, distributions 
that vary over time in response to a shot noise process are considered. An 
additional driving process (with shot noise) was implemented for the count 
model, namely, 

i:ti<t 

where /3 C is a constant baseline, a\ = a c ■ Y{ varies in response to the num- 
ber of attacks on the zth event day, and the decay function g c (u;/i c ) is the 
one parameter shifted geometric function with mean \x c . This results in three 
parameters for the count component 9c = (/3 C , a c , fi c ). Note that this is a sep- 
arate process than the one specified for the hurdle component given by (9) 
and (10). The additional flexibility afforded by the hurdle specification al- 
lows such additional complexity to be introduced for the count component 
of the model. 

There are two primary hypotheses about how the count distribution be- 
haves in response to the event history. First, counts could respond in a self- 
exciting manner, where recent event and multi-event days increase the like- 
lihood of further multi-event days. Alternatively, due to the depletion of 
resources or increased anti-terrorism measures after large attacks, counts 
may respond in a more self-inhibiting manner, where recent multi-event 
days lower the chances of multi-events day in the near future. 

To evaluate the self-exciting hypothesis, let St = (1 — e~ x ^)~ 1 since smaller 
values of s lead to more mass on the extreme values. Alternatively, St = e x *- 
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Table 3 
Parameter estimates for the count models for the training period (1994~2000). C z , C se 
and Csi are the constant parameter, self-exciting and self-inhibiting forms, respectively, 

of the Riemann zeta distribution 



Model 


AIC 




s 


P 


(X 


M 


c 2 


241.00 




2.86 








t~-se 


240.48 


(1- 


-e-**)- 1 


0.375 


0.20 


2.13 


v- .-;/. 


244.72 




e x < 


1.00 


0.34 


55.18 



for the self-inhibiting model. Table 3 shows the estimated parameters and 
AIC scores for the fitted count models. The lower AIC suggests that the 
self-exciting count model has more validity than the self-inhibiting model 
for the training data. Figure 5 shows the estimated probability of counts 
under the constant parameter (C 2 ) and self-exciting (C se ) models. While 
the self-exciting model puts more mass on multi-event days when X£ is 
large, the decay is rapid and the self-exciting behavior is short lived — only 
for a few days, before the shot noise term approaches and its effects become 
negligible. 



4.3. Testing model predictions. Based on the training period, models are 
selected and fit for both the hurdle and count components. To evaluate how 
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Fig. 5. Estimated count probabilities for the constant parameter model (C z with s = 2. 86,) 
and for the self-exciting count model (C se ) at the baseline state (s = 3.19,) and most excited 
state (s = 1.75/ 



SELF- EXCITING HURDLE MODELS FOR TERRORIST ACTIVITY 15 

well this can predict future terrorist activity, the models are applied to 
the terrorist incidents in the testing period (2001-2007) using the following 
procedure: 

(1) At day t predict the probability of an event day and conditional num- 
ber of attacks on day t + 1, based on the current history H t . 

(2) Observe Yf+i and re-estimate the model parameters based on the 
updated history %t+i- 

(3) Repeat steps (1) and (2) through the testing period to construct a fi- 
nal model based on all data between 1994-2007. 

This update and forecast scheme results in a set of daily predictions pt = 
Pr(E t = l\Ht-i) and ft(y) = Pr(Yj = y\E t = l,%t-i) conditioned on the past 
history of the process. 

The predictive capability of the models are evaluated by comparing the 
log probability gain, or predictive log-likelihood ratio, of each model [Daley 
and Vere- Jones (2004)]. This score is a relative measure of the improvement 
in predicative capability for a model compared to a reference or null model. 
The interpretation is that values greater than show a relative improvement 
in predictions over the reference model, with the highest values indicating 
the preferred model. 

For the hurdle component, the log probability gain is 

G=Y^E t \og^ + (l-E t )\og\—^, 

where pt comes from the model of interest and itt from the constant baseline 
only model (BLi), and % = [01 Jan 2001, 31 Dec 2007] covers the testing 
period. Table 4 compares the log probability gains G and the AIC scores 
from the training period, and shows that the results from the test period 
were similar to those in the training period, supporting the use of AIC for 
model selection. 

Table 4 

The hurdle component log probability score, G, using the constant baseline only model 

(BLi) as the reference model. The AIC scores are from the training period (see Table 2) 



Model 


AIC 


G 


Model 


AIC 


G 


BLi 


1,187.77 


0.00 


SEi 


1,075.21 


26.96 


BL 2 


1,149.54 


-19.10 


SE 2 


1,077.16 


25.47 


BL 3 


1,151.25 


20.69 


SE 3 


1,078.71 


24.96 


BL 4 


1,171.22 


-1.39 


SE 4 


1,077.03 


26.22 


BL 5 


1,136.80 


-18.21 


SE 5 


1,078.76 


24.56 


BL 6 


1,138.60 


21.63 


SE 6 


1,079.60 


24.57 
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Table 5 

The count component log probability score, G c . 

using the constant parameter zeta model (C z ) 

as the reference model 



G c 0.00 8.07 -0.27 



The constant baseline self-exciting model (SEi) has the largest log prob- 
ability gain and all self-exciting models outperform all of the baseline only 
models. The best baseline only model for the testing period was the full 
model with seasonality and a quadratic trend (BLg). This is in contrast to 
the training period results where the best baseline only model had a lin- 
ear trend (BL5). The linear trend model failed to perform as well during 
the testing period, as the general attack rate peaked around the transition 
point and started decreasing during the remainder of the testing period, 
creating a significant quadratic trend over the combined observation period. 
Alternatively, the self-exciting models can adapt rapidly to sudden changes 
in attack rate, yielding better overall predictive ability. 

The count component uses the log probability gain score 

i:t t er 2 h (Vi) 

where / is the model of interest and h is the reference model. Table 5 
shows the values of G c using the constant parameter zeta model (C z ) as the 
reference model. These results show the self-exciting zeta model (C se ) as the 
preferred model for predicting the counts. As with the hurdle component, 
the predictive performance of the count models followed the AIC scores from 
the training period. However, the model with the self-exciting component 
does predict substantially better than the constant component model — more 
so than would be expected from the AIC scores in the training period (see 
Table 3). This may be due to changes in the attack behavior during the 
testing period. Figure 1 shows that while there were a few large counts during 
2001, there were very few multi-attack days after. This change in the pattern 
of multiple attacks accounts for the improved predictive capabilities of the 
self-exciting model over the constant parameter model. The self-exciting 
count model is able to quickly adapt to changing conditions and consequently 
makes better predictions. 

In summary, the daily counts of Indonesian terrorist attacks were mod- 
eled with a self-exciting hurdle model. The separation of the model into 
two components allowed both the attack rate and attack characteristics to 
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be more faithfully represented. It was found that including a self-exciting 
process benefited both the hurdle and count components by allowing the 
models to quickly adapt to changing attack behavior. It is this property as 
a predictive tool that makes the self-exciting hurdle model especially useful 
in an applied setting where prediction is of primary interest. 

5. Conclusion. In practice, policy makers are faced with limited resources, 
financially, materially and in personnel. The allocation of these resources 
depends on an understanding of the future risk of terrorist activity. As an 
example, in the wake of the attacks of September 11, 2001, policy makers 
placed additional security in the form of National Guard troops in all US 
commercial airports. This was done in hopes of re-establishing public trust 
in the safety of air travel while the intelligence community was assessing the 
risk of subsequent attacks [Price and Forrest (2009)]. The obvious question 
in that case was when to shift resources from increased airport security to 
investing in long-term strategies to improve intelligence capabilities. As this 
case illustrates, only by clearly understanding the risk of terrorist activity 
can policy makers make informed decisions about allocating resources. 

This paper presents a two-component self-exciting model for the analysis 
and prediction of future terrorist activity. It was found that the best model 
used a constant baseline with a self-exciting term for the hurdle component 
and a Riemann zeta distribution with shot noise driven parameter for the 
count component. The improvement offered by the self-exciting term pro- 
vides support for the contagion theory and suggests a significant short term 
increase in terrorism risk after an attack. The use of a power law distribution 
for the number of same day attacks corresponds with the power law behav- 
ior of the number killed in terrorist attacks [Clauset, Young and Gleditsch 
(2007)]. This is to be expected, as multiple coordinated attacks would tend 
to indicate better planned, and hence more severe, terrorist activities. 

The self-exciting hurdle model adheres to the theoretical concept for a con- 
tagion effect to terrorism as manifested in the clustering of data while pro- 
viding good fit and predictive capabilities without relying on exogenous 
variables. The model provides a simple structure and interpretation of the 
parameters useful for understanding the dynamic nature of the terrorist ac- 
tivity. This provides an appropriate staring point for exploring additional 
covariate effects, including analysis concerning the effectiveness of counter- 
terrorism activities, geography, political or economical factors. As the attack 
clustering may be partially attributable to a stochastic baseline driven by 
such exogenous processes [Holden (1986)], further analysis could include co- 
variates in the baseline model (9) to test the impact on the self-exciting 
component. 

By focusing on the timing of terrorist attacks on the islands of Indonesia, 
other aspects of the attacks, such as location, attack type and group respon- 
sible, were not considered. Including such additional information could lead 
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to more detailed models that better explain the nuances of the terrorist ac- 
tivity. For example, the self-exciting component could include a spatial prox- 
imity term using the models of Mohler et al. (2011) or multiple self-exciting 
terms could be included that represent how attacks from one terrorist group 
influence the subsequent attacks of other groups. 

The utility of the models presented here and their ease of implementation 
and interpretation make them a potentially useful tool in security related 
fields. The results show that the risk of terrorist activity can vary greatly 
over short periods of time, thus policy responses in terms of resource alloca- 
tion, security and counter-terrorism responses should reflect this as well as 
addressing the more long-term trends in risk. For example, understanding 
the short-term variations in risk would allow a more effective deployment and 
assessment of additional airport security measures in the wake of a terrorist 
attack, while a grasp of the longer term trends could help guide the develop- 
ment and assessment of more strategic counter-terrorism resources, such as 
increasing the number of foreign language experts available for translation 
or developing effective de-radicalization programs to prevent the growth of 
terrorist groups. 
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